AITopics | voice input

voice input

GenAI Voice Mode in Programming Education

Jacobs, Sven, Kiesler, Natalie

arXiv.org Artificial Intelligence, Nov-19-2025

Real-time voice interfaces using multimodal Generative AI (GenAI) can potentially address the accessibility needs of novice programmers with disabilities (e.g., related to vision). Yet, little is known about how novices interact with GenAI tools and their feedback quality in the form of audio output. This paper analyzes audio dialogues from nine 9th-grade students using a voice-enabled tutor (powered by OpenAI's Realtime API) in an authentic classroom setting while learning Python. We examined the students' voice prompts and AI's responses (1210 messages) by using qualitative coding. We also gathered students' perceptions via the Partner Modeling Questionnaire. The GenAI Voice Tutor primarily offered feedback on mistakes and next steps, but its correctness was limited (71.4% correct out of 416 feedback outputs). Quality issues were observed, particularly when the AI attempted to utter programming code elements. Students used the GenAI voice tutor primarily for debugging. They perceived it as competent, only somewhat human-like, and flexible. The present study is the first to explore the interaction dynamics of real-time voice GenAI tutors and novice programmers, informing future educational tool design and potentially addressing accessibility needs of diverse learners.

large language model, machine learning, natural language, (17 more...)

doi: 10.1145/3769994.3770001

arXiv: 2509.10596

Country:

Europe > Germany (0.68)
North America > United States > New York (0.16)

Genre:

Research Report > New Finding (1.00)
Instructional Material (1.00)

Industry:

Education > Curriculum (0.69)
Education > Educational Setting > K-12 Education > Secondary School (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.56)

Microsoft adds AI voice chat to Bing on desktop

Engadget, Jun-10-2023, 12:30:55 GMT

You can now talk to Bing on desktop, and it can even read its replies to you out loud. Microsoft has rolled out voice support for the search engine's chatbot on Edge for PCs, which is powered by OpenAI's GPT-4 technology. "We know many of you love using voice input for chat on mobile," the tech giant wrote in its latest Bing preview release notes. The feature first became available on Bing's AI chatbot for its mobile apps. Now it's also available on desktop -- you just need to tap on the mic icon in the Bing Chat box to talk to the AI-powered bot.

ai voice chat, desktop, microsoft, (5 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.78)

Transferring Voice Knowledge for Acoustic Event Detection: An Empirical Study

Liang, Dawei, Shi, Yangyang, Wang, Yun, Singhal, Nayan, Xiao, Alex, Shaw, Jonathan, Thomaz, Edison, Kalinli, Ozlem, Seltzer, Mike

arXiv.org Artificial Intelligence, Oct-7-2021

Detection of common events and scenes from audio is useful for extracting and understanding human contexts in daily life. Prior studies have shown that leveraging knowledge from a relevant domain is beneficial for a target acoustic event detection (AED) process. Inspired by the observation that many human-centered acoustic events in daily life involve voice elements, this paper investigates the potential of transferring high-level voice representations extracted from a public speaker dataset to enrich an AED pipeline. Towards this end, we develop a dual-branch neural network architecture for the joint learning of voice and acoustic features during an AED process and conduct thorough empirical studies to examine the performance on the public AudioSet [1] with different types of inputs. Our main observations are that: 1) Joint learning of audio and voice inputs improves the AED performance (mean average precision) for both a CNN baseline (0.292 vs 0.134 mAP) and a TALNet [2] baseline (0.361 vs 0.351 mAP); 2) Augmenting the extra voice features is critical to maximize the model performance with dual inputs.
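The abstract above reports its results as mean average precision (mAP). As a rough illustration of how such a figure is commonly computed, and not the authors' evaluation code, the sketch below averages per-class average precision over classes. The function names and the toy labels and scores are assumptions made up for this example.

```python
from typing import Sequence


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Average precision for one class: mean of the precision at each positive hit."""
    # Rank clips from the highest score to the lowest.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precision_sum += hits / rank
    # AP is undefined for a class with no positives; this sketch returns 0.0.
    return precision_sum / hits if hits else 0.0


def mean_average_precision(
    labels_by_class: Sequence[Sequence[int]],
    scores_by_class: Sequence[Sequence[float]],
) -> float:
    """Unweighted mean of the per-class average precision values."""
    aps = [average_precision(l, s) for l, s in zip(labels_by_class, scores_by_class)]
    return sum(aps) / len(aps)


# Toy data (hypothetical): two event classes, a handful of clips each.
toy_labels = [[1, 0, 1], [0, 1]]
toy_scores = [[0.9, 0.8, 0.7], [0.2, 0.6]]
toy_map = mean_average_precision(toy_labels, toy_scores)
```

With these toy inputs the first class has positives at ranks 1 and 3, giving an average precision of 5/6, and the second class ranks its only positive first, giving 1.0. Library implementations may handle tied scores and empty classes differently from this sketch.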

event detection, voice branch, voice input, (16 more...)

arXiv: 2110.03174

Country:

Asia > Myanmar > Tanintharyi Region > Dawei (0.05)
North America > United States > Texas > Travis County > Austin (0.04)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Hacking Artificial Intelligence – Influencing and Cases of Manipulation

#artificialintelligence, Mar-7-2020, 20:40:40 GMT

Because the method has taken on such a central role, it is creating some major risks. This article discusses the extent to which AI can be hacked. The discussion presented here applies to both strong and weak AI applications in equal measure. In both cases, input is collected and processed before an appropriate response is produced. It doesn't matter whether the system is designed for classic image recognition, a voice assistant on a smartphone, or a fully automated combat robot.

application, voice assistant, vulnerability, (15 more...)

Industry: Information Technology > Security & Privacy (0.97)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.52)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.37)
Information Technology > Communications > Mobile (0.35)

Researchers Created AI That Hides Your Emotions From Other AI

#artificialintelligence, Aug-26-2019, 20:33:14 GMT

Humans can communicate a range of nonverbal emotions, from terrified shrieks to exasperated groans. Voice inflections and cues can communicate subtle feelings, from ecstasy to agony, arousal and disgust. Even in ordinary speech, the human voice is stuffed with meaning, and it holds a lot of potential value if you're a company collecting personal data. Now, researchers at Imperial College London have used AI to mask the emotional cues in users' voices when they're speaking to internet-connected voice assistants. The idea is to put a "layer" between the user and the cloud their data is uploaded to by automatically converting emotional speech into "normal" speech.

artificial intelligence, chatbot, natural language, (13 more...)

Industry: Information Technology > Security & Privacy (0.36)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.69)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.57)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.54)

Researchers Created AI That Hides Your Emotions From Other AI

#artificialintelligence, Aug-24-2019, 18:41:47 GMT

emotional state, speech, speech engineer, (9 more...)

Industry: Information Technology > Security & Privacy (0.36)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.69)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.57)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.54)

Emotionless: Privacy-Preserving Speech Analysis for Voice Assistants

Aloufi, Ranya, Haddadi, Hamed, Boyle, David

arXiv.org Machine Learning, Aug-9-2019

Voice-enabled interactions provide more human-like experiences in many popular IoT systems. Cloud-based speech analysis services extract useful information from voice input using speech recognition techniques. The voice signal is a rich resource that discloses several possible states of a speaker, such as emotional state, confidence and stress levels, physical condition, age, gender, and personal traits. Service providers can build a very accurate profile of a user's demographic category and personal preferences, which may compromise privacy. To address this problem, a privacy-preserving intermediate layer between users and cloud services is proposed to sanitize the voice input. It aims to maintain utility while preserving user privacy. It achieves this by collecting real-time speech data and analyzing the signal to ensure privacy protection before the data is shared with service providers. Specifically, the sensitive representations are extracted from the raw signal using transformation functions and then wrapped via voice conversion technology. An experimental evaluation based on emotion recognition shows that identification of the speaker's sensitive emotional state is reduced by ~96%.

artificial intelligence, machine learning, speech, (16 more...)

arXiv: 1908.03632

Genre: Research Report (0.50)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine > Therapeutic Area > Neurology (0.46)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.94)

Microsoft patent suggests you whisper to your voice assistants

Engadget, Jan-3-2019, 16:20:26 GMT

While voice assistants have grown in popularity over recent years, many people still hesitate to use them in public spaces, and that's a problem Microsoft is looking to tackle. In a patent filing, the company notes that for a number of reasons -- not wanting to disturb those nearby, not wanting to share private information around strangers -- people often avoid issuing voice commands when in public. "Although performance of voice input has been greatly improved, the voice input is still rarely used in public spaces, such as office or even homes," says the patent filing. "These are not technical issues but social issues. Hence there is no easy fix even if voice recognition system performance is greatly improved."

artificial intelligence, speech recognition, voice assistant, (9 more...)

Technology: Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.97)

Amazon wants to turn Alexa into a makeshift doctor

Daily Mail - Science & tech, Oct-16-2018, 11:14:42 GMT

Alexa may soon be able to act as an in-house doctor for poorly or upset users. A patent that was filed by Amazon reveals that Alexa will automatically detect unusual changes in a person's voice and speaking patterns. The AI-powered smart speaker will also pick up on auditory clues like coughs and moans, then offer suggestions to help aid a speedy recovery. These could include suggesting you eat a bowl of chicken soup, as well as offering to deliver cough tablets and tissues and to play you soothing music. Amazon has successfully obtained a patent which would allow Alexa to detect unusual changes in a person's voice caused by illness or crying.

amazon, artificial intelligence, child version, (13 more...)

Country: North America > United States > Louisiana (0.16)

Industry:

Health & Medicine (0.50)
Government > Regional Government > North America Government > United States Government (0.33)
Law > Intellectual Property & Technology Law (0.32)

Technology: Information Technology > Artificial Intelligence (0.37)

Amazon Files for Patent to Detect User Illness and Emotional State by Analyzing Voice Data - Voicebot

#artificialintelligence, Oct-14-2018, 20:04:04 GMT

Amazon yesterday filed a patent with the U.S. Patent and Trademark Office related to detecting the physical and emotional wellbeing of users based on interactions captured in voice data. The first example in the patent application depicts a user coughing while asking Alexa about being hungry. Alexa responds by suggesting a chicken soup recipe and, when that is refused, offers to order cough drops with one-hour delivery. The voice recognition system uses sounds such as a cough or sniffle to determine if a user is unwell. However, the patent is not limited to these sounds and could be extended to different types of normal speech.

artificial intelligence, natural language, user illness and emotional state, (14 more...)

Industry: Law > Intellectual Property & Technology Law (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.57)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.57)
